Bounded-Parameter Partially Observable Markov Decision Processes
نویسندگان
چکیده
The POMDP is considered as a powerful model for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model precisely the real-life situations, due to various reasons such as limited data for learning the model, etc. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter partially observable Markov decision processes (BPOMDPs). A modified value iteration is proposed as a basic strategy for tackling parameter imprecision in BPOMDPs. In addition, we design the UL-based value iteration algorithm, in which each value backup is based on two sets of vectors called Uset and L-set. We propose four typical strategies for setting U-set and L-set, and some of them guarantee that the modified value iteration is implemented through the algorithm. We analyze theoretically the computational complexity and the reward loss of the algorithm. The effectiveness and robustness of the algorithm are revealed by empirical studies.
منابع مشابه
Permissive Finite-State Controllers of POMDPs using Parameter Synthesis
We study finite-state controllers (FSCs) for partially observable Markov decision processes (POMDPs). The key insight is that computing (randomized) FSCs on POMDPs is equivalent to synthesis for parametric Markov chains (pMCs). This correspondence enables using parameter synthesis techniques to compute FSCs for POMDPs in a black-box fashion. We investigate how typical restrictions on parameter ...
متن کاملA POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems
Maintenance can be the factor of either increasing or decreasing system's availability, so it is valuable work to evaluate a maintenance policy from cost and availability point of view, simultaneously and according to decision maker's priorities. This study proposes a Partially Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating syste...
متن کاملStructured Parameter Elicitation
The behavior of a complex system often depends on parameters whose values are unknown in advance. To operate effectively, an autonomous agent must actively gather information on the parameter values while progressing towards its goal. We call this problem parameter elicitation. Partially observable Markov decision processes (POMDPs) provide a principled framework for such uncertainty planning t...
متن کاملProducing efficient error-bounded solutions for transition independent decentralized mdps
There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error-bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading ...
متن کاملPoint Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
We present four major results towards solving decentralized partially observable Markov decision problems (DecPOMDPs) culminating in an algorithm that outperforms all existing algorithms on all but one standard infinite-horizon benchmark problems. (1) We give an integer program that solves collaborative Bayesian games (CBGs). The program is notable because its linear relaxation is very often in...
متن کامل